Regression for Sentence-Level MT Evaluation with Pseudo References
Authors
Abstract
Many automatic evaluation metrics for machine translation (MT) rely on making comparisons to human translations, a resource that may not always be available. We present a method for developing sentence-level MT evaluation metrics that do not directly rely on human reference translations. Our metrics are developed using regression learning and are based on a set of weaker indicators of fluency and adequacy (pseudo references). Experimental results suggest that they rival standard reference-based metrics in terms of correlations with human judgments on new test instances.
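The idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the abstract does not specify the features or the learner, so this example uses mean unigram precision and recall against pseudo references (for example, outputs of other MT systems) as features, and a plain least-squares linear regression fitted by gradient descent. The function names `overlap_features`, `fit_linear`, and `predict` are hypothetical.

```python
from collections import Counter


def overlap_features(hypothesis, pseudo_refs):
    """Return [mean unigram precision, mean unigram recall] of a hypothesis
    against a list of pseudo references (assumed feature set, not the paper's)."""
    hyp = Counter(hypothesis.lower().split())
    hyp_len = max(sum(hyp.values()), 1)
    precisions, recalls = [], []
    for ref in pseudo_refs:
        ref_counts = Counter(ref.lower().split())
        # Clipped unigram matches: multiset intersection of the two bags of words.
        matched = sum((hyp & ref_counts).values())
        precisions.append(matched / hyp_len)
        recalls.append(matched / max(sum(ref_counts.values()), 1))
    n = max(len(pseudo_refs), 1)
    return [sum(precisions) / n, sum(recalls) / n]


def fit_linear(X, y, lr=0.1, epochs=2000):
    """Least-squares linear regression by batch gradient descent.
    Returns (weights, bias). A stand-in for whatever regression learner is used."""
    w = [0.0] * len(X[0])
    b = 0.0
    m = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj / m
            grad_b += err / m
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(model, features):
    """Score one sentence from its feature vector using a fitted (weights, bias)."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, features)) + b
```

In use, one would compute `overlap_features` for each training sentence, fit the model to human judgment scores with `fit_linear`, and call `predict` on the features of new MT outputs. No human reference translation is needed at prediction time, which is the point the abstract makes.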
Related resources
The Role of Pseudo References in MT Evaluation
Previous studies have shown automatic evaluation metrics to be more reliable when compared against many human translations. However, multiple human references may not always be available. It is common that automatic metrics must make judgments based on a single human reference (extracted from parallel texts) or no reference at all. Our earlier work suggested that a promising way to address this...
Automatic Post-Editing based on SMT and its selective application by Sentence-Level Automatic Quality Evaluation
In computer-assisted translation with machine translation (MT), post-editing costs human time and effort. To solve this problem, some have attempted to automate post-editing. Post-editing isn't always necessary, however, when MT outputs are of adequate quality for humans. This means that we need to be able to estimate the translation quality of each translated sentenc...
Global Source-Aware Statistical Post-Editing for General MT: Sentence Specification via Pseudo-Feedback
Automatic post-editing (APE), which can correct translation errors, is an effective approach to improving machine translation (MT) output quality. This paper proposes a global source-aware SPE model to improve MT quality, leveraging pseudo-feedback to achieve sentence specification. For a given source sentence, some similar sentences are retrieved from a translation m...
Towards Optimizing MT for Post-Editing Effort: Can BLEU Still Be Useful?
We propose a simple, linear-combination automatic evaluation measure (AEM) to approximate post-editing (PE) effort. Effort is measured both as PE time and as the number of PE operations performed. The ultimate goal is to define an AEM that can be used to optimize machine translation (MT) systems to minimize PE effort, but without having to perform unfeasible repeated PE during optimization. As P...
Document-level translation quality estimation: exploring discourse and pseudo-references
Predicting the quality of machine translations is a challenging topic. Quality estimation (QE) of translations is based on features of the source and target texts (without the need for human references), and on supervised machine learning methods to build prediction models. Engineering well-performing features is therefore crucial in QE modelling. Several features have been used so far, but the...